
127 research outputs found

A Universal Parallel Two-Pass MDL Context Tree Compression Algorithm

Author: Baron Dror
Krishnan Nikhil
Publication date: 21/03/2015

Computing problems that handle large amounts of data necessitate the use of lossless data compression for efficient storage and transmission. We present a novel lossless universal data compression algorithm that uses parallel computational units to increase the throughput. The length-$N$ input sequence is partitioned into $B$ blocks. Processing each block independently of the other blocks can accelerate the computation by a factor of $B$, but degrades the compression quality. Instead, our approach is to first estimate the minimum description length (MDL) context tree source underlying the entire input, and then encode each of the $B$ blocks in parallel based on the MDL source. With this two-pass approach, the compression loss incurred by using more parallel units is insignificant. Our algorithm is work-efficient, i.e., its computational complexity is $O(N/B)$. Its redundancy is approximately $B\log(N/B)$ bits above Rissanen's lower bound on universal compression performance, with respect to any context tree source whose maximal depth is at most $\log(N/B)$. We improve the compression by using different quantizers for states of the context tree based on the number of symbols corresponding to those states. Numerical results from a prototype implementation suggest that our algorithm offers a better trade-off between compression and throughput than competing universal data compression algorithms.

Comment: Accepted to Journal of Selected Topics in Signal Processing special issue on Signal Processing for Big Data (expected publication date June 2015). 10 pages double column, 6 figures, and 2 tables. arXiv admin note: substantial text overlap with arXiv:1405.6322. Version: Mar 2015: Corrected a typ

arXiv.org e-Print Archive
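
The two-pass idea in the abstract above (estimate one model from the entire input, then code the blocks in parallel against it) can be sketched roughly as follows. This is a heavily simplified illustration, not the authors' algorithm: it assumes a binary alphabet, uses a fixed context depth `K` instead of a pruned MDL context tree, smooths counts with a Krichevsky-Trofimov style add-1/2 rule, and reports ideal code lengths in bits rather than running an arithmetic coder. All function names are hypothetical.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

K = 2  # assumed fixed context depth; the paper estimates an MDL context tree instead


def contexts(block, k=K):
    """Yield (context, symbol) pairs; every block starts from an all-zero context."""
    padded = [0] * k + list(block)
    for i in range(k, len(padded)):
        yield tuple(padded[i - k:i]), padded[i]


def estimate_model(blocks, k=K):
    """Pass 1: pool counts over all blocks and return smoothed P(1 | context)."""
    counts = Counter()
    for block in blocks:
        counts.update(contexts(block, k))
    model = {}
    for ctx in {c for c, _ in counts}:
        n0, n1 = counts[(ctx, 0)], counts[(ctx, 1)]
        model[ctx] = (n1 + 0.5) / (n0 + n1 + 1.0)  # add-1/2 smoothing
    return model


def block_code_length(block, model, k=K):
    """Pass 2, one block: ideal code length in bits under the shared model."""
    bits = 0.0
    for ctx, sym in contexts(block, k):
        p1 = model.get(ctx, 0.5)
        bits -= math.log2(p1 if sym == 1 else 1.0 - p1)
    return bits


def two_pass_code_lengths(data, num_blocks, k=K):
    """Split data into blocks, fit one global model, then score blocks in parallel."""
    size = math.ceil(len(data) / num_blocks)
    blocks = [data[i:i + size] for i in range(0, len(data), size)]
    model = estimate_model(blocks, k)
    with ThreadPoolExecutor(max_workers=num_blocks) as pool:
        return list(pool.map(lambda b: block_code_length(b, model, k), blocks))


lengths = two_pass_code_lengths([0, 1] * 32, num_blocks=4)
print(len(lengths), sum(lengths))
```

Because every block is scored against the same model, the second pass has no dependencies between blocks, which is what allows it to run on $B$ units at once. The sketch omits the cost of describing the model itself, which a real two-part code would have to include.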

A Parallel Two-Pass MDL Context Tree Algorithm for Universal Source Coding

Author: Baron Dror
Krishnan Nikhil
Mıhçak Mehmet Kıvanç
Publication date: 01/01/2014

We present a novel lossless universal source coding algorithm that uses parallel computational units to increase the throughput. The length-$N$ input sequence is partitioned into $B$ blocks. Processing each block independently of the other blocks can accelerate the computation by a factor of $B$, but degrades the compression quality. Instead, our approach is to first estimate the minimum description length (MDL) source underlying the entire input, and then encode each of the $B$ blocks in parallel based on the MDL source. With this two-pass approach, the compression loss incurred by using more parallel units is insignificant. Our algorithm is work-efficient, i.e., its computational complexity is $O(N/B)$. Its redundancy is approximately $B\log(N/B)$ bits above Rissanen's lower bound on universal coding performance, with respect to any tree source whose maximal depth is at most $\log(N/B)$.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Empirical Bayes and Full Bayes for Signal Estimation

Author: Baron Dror
Krishnan Nikhil
Ma Yanting
Tan Jin
Publication date: 08/05/2014

We consider signals that follow a parametric distribution where the parameter values are unknown. To estimate such signals from noisy measurements in scalar channels, we study the empirical performance of an empirical Bayes (EB) approach and a full Bayes (FB) approach. We then apply EB and FB to solve compressed sensing (CS) signal estimation problems by successively denoising a scalar Gaussian channel within an approximate message passing (AMP) framework. Our numerical results show that FB achieves better performance than EB in scalar channel denoising problems when the signal dimension is small. In the CS setting, the signal dimension must be large enough for AMP to work well; for large signal dimensions, AMP performs similarly with FB and EB.

Comment: This work was presented at the Information Theory and Application workshop (ITA), San Diego, CA, Feb. 201

arXiv.org e-Print Archive

CiteSeerX

A Study on the Impact of Locality in the Decoding of Binary Cyclic Codes

Author: Barg Alexander
Krishnan M. Nikhil
Kumar P. Vijay
Puranik Bhagyashree
Tamo Itzhak
Publication date: 13/02/2017

In this paper, we study the impact of locality on the decoding of binary cyclic codes under two approaches, namely ordered statistics decoding (OSD) and trellis decoding. Given a binary cyclic code having locality or availability, we suitably modify the OSD to obtain gains in terms of the signal-to-noise ratio (SNR), for a given reliability and essentially the same level of decoder complexity. With regard to trellis decoding, we show that careful introduction of locality results in the creation of cyclic subcodes having lower maximum state complexity. We also present a simple upper-bounding technique on the state complexity profile, based on the zeros of the code. Finally, it is shown how the decoding speed can be significantly increased in the presence of locality, in the moderate-to-high SNR regime, by making use of a quick-look decoder that often returns the ML codeword.

Comment: Extended version of a paper submitted to ISIT 201

arXiv.org e-Print Archive

Crossref

Open Access Repository of IISc Research Publications

Rate-Optimal Streaming Codes for Channels with Burst and Isolated Erasures

Author: Krishnan M. Nikhil
Kumar P. Vijay
Publication date: 17/01/2018

Recovery of data packets from packet erasures in a timely manner is critical for many streaming applications. An early paper by Martinian and Sundberg introduced a framework for streaming codes and designed rate-optimal codes that permit delay-constrained recovery from an erasure burst of length up to $B$. A recent work by Badr et al. extended this result and introduced a sliding-window channel model $\mathcal{C}(N,B,W)$. Under this model, in a sliding window of width $W$, one of the following erasure patterns is possible: (i) a burst of length at most $B$ or (ii) at most $N$ (possibly non-contiguous) arbitrary erasures. Badr et al. obtained a rate upper bound for streaming codes that can recover, with a time delay $T$, from any erasure pattern permissible under the $\mathcal{C}(N,B,W)$ model. However, constructions matching the bound were absent, except for a few parameter sets. In this paper, we present an explicit family of codes that achieves the rate upper bound for all feasible parameters $N$, $B$, $W$ and $T$.

Comment: shorter version submitted to ISIT 201

arXiv.org e-Print Archive

Crossref

Open Access Repository of IISc Research Publications
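
The sliding-window channel model $\mathcal{C}(N,B,W)$ described in the abstract above can be made concrete with a small admissibility check. The sketch below is an illustration only: the function name and the 0/1 list representation are hypothetical, and it assumes that a window counts as a "burst of length at most $B$" whenever all of its erasures fall within $B$ consecutive positions.

```python
def is_admissible(erasures, n, b, w):
    """Return True if every width-w window satisfies the C(N, B, W) constraint.

    erasures: sequence of 0/1 flags, 1 meaning the packet in that slot is erased.
    n: maximum number of arbitrary erasures allowed per window.
    b: maximum burst length allowed per window.
    w: window width.
    """
    length = len(erasures)
    # If the sequence is shorter than w, treat it as a single window.
    for start in range(max(1, length - w + 1)):
        hits = [i for i in range(start, min(start + w, length)) if erasures[i]]
        if len(hits) <= n:
            continue  # case (ii): at most n arbitrary erasures
        if hits[-1] - hits[0] + 1 <= b:
            continue  # case (i): all erasures fit inside a burst of length <= b
        return False
    return True


# Parameters chosen only for illustration: N = 2, B = 3, W = 5.
print(is_admissible([1, 1, 1, 0, 0, 0, 0, 0], n=2, b=3, w=5))  # burst of 3
print(is_admissible([1, 0, 1, 0, 1, 0, 0, 0], n=2, b=3, w=5))  # 3 spread-out erasures
```

With these parameters the first pattern is admissible as a burst, while the second places three non-contiguous erasures in one window and violates both conditions.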

Sequential Gradient Coding For Straggler Mitigation

Author: Ebrahimi MohammadReza
Khisti Ashish
Krishnan M. Nikhil
Publication date: 24/11/2022

In distributed computing, slower nodes (stragglers) usually become a bottleneck. Gradient Coding (GC), introduced by Tandon et al., is an efficient technique that uses principles of error-correcting codes to distribute gradient computation in the presence of stragglers. In this paper, we consider the distributed computation of a sequence of gradients $\{g(1),g(2),\ldots,g(J)\}$, where processing of each gradient $g(t)$ starts in round-$t$ and finishes by round-$(t+T)$. Here $T\geq 0$ denotes a delay parameter. For the GC scheme, coding is only across computing nodes and this results in a solution where $T=0$. On the other hand, having $T>0$ allows for designing schemes which exploit the temporal dimension as well. In this work, we propose two schemes that demonstrate improved performance compared to GC. Our first scheme combines GC with selective repetition of previously unfinished tasks and achieves improved straggler mitigation. In our second scheme, which constitutes our main contribution, we apply GC to a subset of the tasks and repetition for the remainder of the tasks. We then multiplex these two classes of tasks across workers and rounds in an adaptive manner, based on past straggler patterns. Using theoretical analysis, we demonstrate that our second scheme achieves significant reduction in the computational load. In our experiments, we study a practical setting of concurrently training multiple neural networks over an AWS Lambda cluster involving 256 worker nodes, where our framework naturally applies. We demonstrate that the latter scheme can yield a 16% improvement in runtime over the baseline GC scheme, in the presence of naturally occurring, non-simulated stragglers.

arXiv.org e-Print Archive
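
As background for the abstract above, the baseline idea of coding gradients across workers (the $T=0$ case) can be illustrated with a small three-worker example that tolerates one straggler. This sketch shows only baseline gradient coding, not the sequential schemes proposed in the paper; the coefficient tables, function names, and toy gradient values are assumptions chosen for illustration.

```python
from itertools import combinations

# Row w holds the coefficients worker w applies to the partial gradients (g1, g2, g3).
ENCODE = [
    (0.5, 1.0, 0.0),   # worker 0 sends g1/2 + g2
    (0.0, 1.0, -1.0),  # worker 1 sends g2 - g3
    (0.5, 0.0, 1.0),   # worker 2 sends g1/2 + g3
]

# Weights that recover g1 + g2 + g3 from each pair of finished workers.
DECODE = {
    (0, 1): (2.0, -1.0),
    (0, 2): (1.0, 1.0),
    (1, 2): (1.0, 2.0),
}


def worker_message(worker, grads):
    """Linear combination of the partial gradients that one worker transmits."""
    coeffs = ENCODE[worker]
    dim = len(grads[0])
    return [sum(c * g[d] for c, g in zip(coeffs, grads)) for d in range(dim)]


def decode(finished, messages):
    """Recover the full gradient sum from any two finished workers."""
    pair = tuple(sorted(finished))
    weights = DECODE[pair]
    dim = len(messages[pair[0]])
    return [sum(wt * messages[k][d] for wt, k in zip(weights, pair)) for d in range(dim)]


# Toy partial gradients (hypothetical values), each of dimension 2.
grads = [[1.0, 2.0], [3.0, -1.0], [0.5, 4.0]]
messages = {w: worker_message(w, grads) for w in range(3)}
for pair in combinations(range(3), 2):
    print(pair, decode(pair, messages))
```

Each worker computes two of the three partial gradients, and any two responses suffice to reconstruct the full sum, so a single straggler never delays the round.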